A Local and Global Discretization Method

Authors

  • Yu Sang
  • Keqiu Li
  • Heng Qi
  • Yueting Zhu
Abstract

Most machine learning and data mining algorithms require that the training data contain only discrete attributes, which makes it necessary to discretize continuous numeric attributes. Bottom-up discretization algorithms are well-known methods for this task. They mainly discretize data based on either a local or a global independence measure. In this paper, we present a novel bottom-up discretization method that combines local and global independence measures. First, we present a novel merging criterion that captures, both locally and globally, the independence between the discretized attributes and the decision class; this is done by evaluating pairs of intervals with a proposed measure and by developing a measure of the significance of an interval pair among attributes. The advantage of the proposed merging criterion is further analyzed. Moreover, we develop an algorithm to find the best discretization based on the new merging criterion. Detailed analysis shows that the proposed method brings higher accuracy to the discretization process. Finally, we conduct extensive experiments on 18 real-world datasets to evaluate the performance of the proposed method in comparison with existing methods. The experimental results show that the proposed method outperforms existing methods on the performance metrics considered.

Keywords: Machine Learning; Data Mining; Discretization; Merging Criterion; Independence Measure
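To make the bottom-up idea concrete, the following is a minimal Python sketch of a generic merging loop. It does not implement the paper's combined local and global criterion, which the abstract does not specify. As a stand-in it uses the classic Pearson chi-square statistic on adjacent intervals (a ChiMerge-style local measure) and, as a simplifying assumption, stops at a fixed number of intervals rather than at a significance threshold. The names `chi2` and `bottom_up_discretize` are hypothetical.

```python
from collections import Counter


def chi2(a, b, classes):
    """Pearson chi-square statistic for the class counts of two intervals."""
    total = sum(a.values()) + sum(b.values())
    stat = 0.0
    for row in (a, b):
        row_sum = sum(row.values())
        for c in classes:
            col_sum = a[c] + b[c]
            expected = row_sum * col_sum / total
            if expected > 0:  # skip classes absent from both intervals
                stat += (row[c] - expected) ** 2 / expected
    return stat


def bottom_up_discretize(values, labels, max_intervals):
    """Merge adjacent intervals until at most max_intervals remain.

    Returns the cut points, i.e. the lower bound of every interval
    except the first. Stand-in criterion: merge the adjacent pair with
    the smallest chi-square value (the pair whose class distributions
    look most alike).
    """
    classes = sorted(set(labels))
    # Start with one interval per distinct value: [lower_bound, class counts].
    intervals = []
    for v, y in sorted(zip(values, labels), key=lambda p: p[0]):
        if intervals and intervals[-1][0] == v:
            intervals[-1][1][y] += 1
        else:
            intervals.append([v, Counter({y: 1})])
    while len(intervals) > max_intervals:
        scores = [
            chi2(intervals[i][1], intervals[i + 1][1], classes)
            for i in range(len(intervals) - 1)
        ]
        i = scores.index(min(scores))
        intervals[i][1] += intervals[i + 1][1]  # absorb the right neighbour
        del intervals[i + 1]
    return [lower for lower, _ in intervals[1:]]


cuts = bottom_up_discretize([1, 2, 3, 10, 11, 12], list("aaabbb"), 2)
print(cuts)
```

On this toy input the loop keeps merging neighbours that share a class and leaves a single cut between the two classes. A method like the one in the abstract would replace `chi2` with its own interval-pair measure and would also weigh each pair's significance across attributes.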


Similar resources

Local and Global Approaches to Fracture Mechanics Using Isogeometric Analysis Method

The present research investigates the implementations of different computational geometry technologies in isogeometric analysis framework for computational fracture mechanics. NURBS and T-splines are two different computational geometry technologies which are studied in this work. Among the features of B-spline basis functions, the possibility of enhancing a B-spline basis with discontinuities ...


Numerical solution of unsteady flow on airfoils with vibrating local flexible membrane

Unsteady flow separation on airfoils with a local flexible membrane (LFM) has been investigated in transient and laminar flows by the finite volume element method. A unique feature of the present method compared with common computational fluid dynamics software, especially ANSYS CFX, is the modification using the physical influence scheme in convection fluxes at cell surfaces. In contr...


Global discretization of continuous attributes as preprocessing for machine learning

Real-life data usually are presented in databases by real numbers. On the other hand, most inductive learning methods require a small number of attribute values. Thus it is necessary to convert input data sets with continuous attributes into input data sets with discrete attributes. Methods of discretization restricted to single continuous attributes will be called local, while methods that sim...


A Novel Method for Content Base Image Retrieval Using Combination of Local and Global Features

Content-based image retrieval (CBIR) has been an active research topic in the last decade. In this paper we propose an image retrieval method using global and local features. Firstly, for local feature extraction, the SURF algorithm produces a set of interest points for each image and a set of 64-dimensional descriptors for each interest point and then to use Bag of Visual Words model, a cluster...



Publication date: 2014